LOCAL TAIL BOUNDS FOR FUNCTIONS OF INDEPENDENT
RANDOM VARIABLES

By Luc Devroye and Gábor Lugosi
McGill University and ICREA and Pompeu Fabra University 

It is shown that functions defined on $\{0,1,\ldots,r-1\}^n$ satisfying
certain conditions of bounded differences that guarantee sub-Gaussian
tail behavior also satisfy a much stronger "local" sub-Gaussian
property. For self-bounding and configuration functions we derive
analogous locally subexponential behavior. The key tool is Talagrand's
[Ann. Probab. 22 (1994) 1576-1587] variance inequality for functions
defined on the binary hypercube, which we extend to functions of
uniformly distributed random variables defined on
$\{0,1,\ldots,r-1\}^n$ for $r \ge 2$.

1. Introduction. Concentration inequalities for functions of independent 
random variables establish upper bounds for the tail probabilities of such 
functions under general "smoothness" conditions; see, for example, Tala- 
grand [30, 31, 32], Ledoux [19, 20], Boucheron, Lugosi, Massart [7, 8], Mc- 
Diarmid [23], and so on. In this paper we take a closer look at the distri- 
bution of certain functions of independent random variables and show that 
the tail distribution exhibits a sub-Gaussian (or subexponential) behavior 
in a stronger "local" sense in many cases when concentration inequalities 
predict a sub-Gaussian (subexponential) tail. 

First we consider real-valued functions defined on the binary hypercube
$f:\{0,1\}^n \to \mathbb{R}$. If $X=(X_1,\ldots,X_n)$ is uniformly
distributed on the hypercube, we are interested in the distribution of
the random variable $f(X)$. Our starting point is the following
inequality, due to Talagrand [29]:

(1.1)  $$\operatorname{Var}(f) \le \frac{9}{10}\sum_{i=1}^n \frac{E(f(X)-f(X^{(i)}))^2}{1+\log\bigl(\sqrt{E(f(X)-f(X^{(i)}))^2}\,/\,E|f(X)-f(X^{(i)})|\bigr)},$$

where $X^{(i)}=(X_1,\ldots,1-X_i,\ldots,X_n)$ is obtained by flipping the
$i$th bit of $X$ and $\operatorname{Var}(f)$ denotes the variance of the
random variable $f(X)$. The constants shown here follow from a simple
proof by Benjamini, Kalai and Schramm [5].

Note that (apart from numerical constants) Talagrand's inequality
improves upon the well-known Efron-Stein inequality (see Efron and Stein
[11], Rhee and Talagrand [27], Steele [28]):

$$\operatorname{Var}(f) \le \frac{1}{4}\sum_{i=1}^n E(f(X)-f(X^{(i)}))^2 .$$
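For small $n$ both variance bounds can be evaluated exactly by enumerating the hypercube. The following is a minimal sketch under our own assumptions: the helper names `flip` and `variance_bounds` are hypothetical, and the Talagrand-type sum is returned without its numerical constant, so only its shape is illustrated.

```python
from itertools import product
from math import log, sqrt


def flip(x, i):
    """Return x with its i-th bit flipped (the point x^(i))."""
    return x[:i] + (1 - x[i],) + x[i + 1:]


def variance_bounds(f, n):
    """Exact Var(f), the Efron-Stein bound, and the Talagrand-type sum
    (without numerical constant) for X uniform on {0,1}^n."""
    cube = list(product((0, 1), repeat=n))
    size = len(cube)
    mean = sum(f(x) for x in cube) / size
    var = sum((f(x) - mean) ** 2 for x in cube) / size
    efron_stein = 0.0
    talagrand_sum = 0.0
    for i in range(n):
        diffs = [f(x) - f(flip(x, i)) for x in cube]
        l2_sq = sum(d * d for d in diffs) / size   # E(f(X) - f(X^(i)))^2
        l1 = sum(abs(d) for d in diffs) / size     # E|f(X) - f(X^(i))|
        efron_stein += l2_sq / 4
        if l1 > 0:
            talagrand_sum += l2_sq / (1 + log(sqrt(l2_sq) / l1))
    return var, efron_stein, talagrand_sum
```

For a function that depends on few "influential" flips (for example the logical OR of the bits) the logarithmic denominator is large and the Talagrand-type sum is noticeably smaller than the plain sum of squared differences.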

In Section 2 we show how to use Talagrand's inequality to prove "local"
sub-Gaussian concentration inequalities. As a simple example, we show
that if $f:\{0,1\}^n \to \mathbb{R}$ is such that there exists a constant
$v$ such that $\sum_{i=1}^n (f(x)-f(x^{(i)}))_+^2 \le v$, then for all
$k = 1,2,3,\ldots$,

$$a_{k+1}-a_k \le c\sqrt{\frac{v}{k}},$$

where $a_k$ denotes the $1-2^{-k}$ quantile of $f(X)$ and $c$ is a
universal constant.
The main argument is based on an observation of Benjamini, Kalai and 
Schramm [5] who show how Talagrand's inequality may be used to obtain 
exponential concentration inequalities. Even though Benjamini, Kalai and 
Schramm do not mention the possibility of deriving local concentration in- 
equalities, it is their argument which is at the basis of our proofs. The 
purpose of this paper is to elaborate on this argument and to derive local 
concentration inequalities under different conditions. In Sections 3, 4 and 5 
various variants and extensions are introduced. In Section 3 local concen- 
tration inequalities are shown under different conditions that are satisfied 
for numerous natural examples such as configuration functions introduced 
by Talagrand [30] — for self-bounding functions, see Boucheron, Lugosi and 
Massart [7], Maurer [22] and McDiarmid and Reed [24]. 

In Section 4, Talagrand's inequality is extended from the binary
hypercube to functions defined on $\{0,1,\ldots,r-1\}^n$ under the
uniform distribution. The main technical tool here is a suitable
hypercontractive inequality proved by Alon, Dinur, Friedgut and Sudakov
[2]. This extension allows us to generalize the results of Sections 2
and 3 to functions defined on $\{0,1,\ldots,r-1\}^n$.

In Section 5 we illustrate the use of the results of Section 4 by considering 
two classical, structurally similar, problems. We derive local concentration 
inequalities for the cost of the minimum weight spanning tree of a complete
graph with random uniform weights on the edges and also for the assignment 
problem. 

2. Functions with locally sub-Gaussian behavior. First we consider
functions $f:\{0,1\}^n \to \mathbb{R}$ which satisfy the following
property: for all $x=(x_1,\ldots,x_n)\in\{0,1\}^n$,

(2.1)  $$\sum_{i=1}^n (f(x)-f(x^{(i)}))_+^2 \le v,$$

where $v$ is a positive constant. [Here and throughout the paper,
$a_+=\max(a,0)$ and $a_-=\max(-a,0)$ denote the positive and negative
parts of the real number $a$.] Clearly, if $f$ is 1-Lipschitz under the
Hamming distance, then $v \le n$, but there are many interesting examples
in which $v$ is significantly smaller than $n$. It is well known (see
Ledoux [19], or Boucheron, Lugosi and Massart [7]) that for such
functions

(2.2)  $$P\{f(X) \ge Ef(X)+t\} \le e^{-t^2/(4v)} .$$

Our basic result (Theorem 2.1) shows that tail quantiles of the random
variable $f(X)$ are not far apart. In this sense, it is a local tail
bound. For any $\alpha\in(0,1)$, define the $\alpha$-quantile of $f$ by

$$Q_\alpha = \inf\{z : P\{f(X)\le z\}\ge\alpha\}.$$

In particular, we denote the median of $f(X)$ by $Mf = Q_{1/2}$.

Theorem 2.1. Assume $f$ satisfies (2.1) and let
$B=\max_{x,i}|f(x)-f(x^{(i)})|$. Then for all $b>a\ge Mf$,

$$b-a \le \sqrt{\frac{(72/5)\,v\,P\{f(X)\in(a,b+B)\}}{P\{f(X)\ge b\}\,\log\bigl(e^2/(2P\{f(X)\in(a,b+B)\})\bigr)}}
\le \sqrt{\frac{(72/5)\,v\,P\{f(X)>a\}}{P\{f(X)\ge b\}\,\log\bigl(e^2/(2P\{f(X)>a\})\bigr)}} .$$



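Proof. Define the function $g_{a,b}:\{0,1\}^n\to\mathbb{R}$ by

$$g_{a,b}(x)=\begin{cases} b, & \text{if } f(x)\ge b,\\ f(x), & \text{if } a<f(x)<b,\\ a, & \text{if } f(x)\le a.\end{cases}$$

First observe that

$$\operatorname{Var}(g_{a,b}(X)) \ge \frac{P\{f(X)\ge b\}}{4}\,(b-a)^2 .$$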
On the other hand, we may use Talagrand's inequality to obtain an upper
bound for the variance of $g_{a,b}(X)$. To this end, observe that

$$\begin{aligned}
E|g_{a,b}(X)-g_{a,b}(X^{(i)})|
&= 2E(g_{a,b}(X)-g_{a,b}(X^{(i)}))_+ \\
&= 2E\bigl[(g_{a,b}(X)-g_{a,b}(X^{(i)}))_+\,1_{f(X)\in(a,b+B)}\bigr]
 \quad\text{(by the definition of } g_{a,b} \text{ and } B)\\
&\le 2\sqrt{E(g_{a,b}(X)-g_{a,b}(X^{(i)}))_+^2}\,\sqrt{P\{f(X)\in(a,b+B)\}}
 \quad\text{(by Cauchy-Schwarz)}\\
&= \sqrt{2E(g_{a,b}(X)-g_{a,b}(X^{(i)}))^2}\,\sqrt{P\{f(X)\in(a,b+B)\}} .
\end{aligned}$$
On the other hand,

$$\begin{aligned}
\sum_{i=1}^n E(g_{a,b}(X)-g_{a,b}(X^{(i)}))^2
&= 2\sum_{i=1}^n E(g_{a,b}(X)-g_{a,b}(X^{(i)}))_+^2 \\
&= 2E\Bigl[1_{f(X)\in(a,b+B)}\sum_{i=1}^n (g_{a,b}(X)-g_{a,b}(X^{(i)}))_+^2\Bigr] \\
&\le 2vP\{f(X)\in(a,b+B)\},
\end{aligned}$$

where in the last step we used the fact that (2.1) implies that

$$\sum_{i=1}^n (g_{a,b}(X)-g_{a,b}(X^{(i)}))_+^2 \le \sum_{i=1}^n (f(X)-f(X^{(i)}))_+^2 \le v .$$

Combining the lower bound for the variance with the upper bound obtained
by Talagrand's inequality yields the claim. □

To make Theorem 2.1 more transparent, we state a simple corollary for
quantiles of $f(X)$. Using $P\{f(X)>Q_{1-\gamma}\}\le\gamma$ and
$P\{f(X)\ge Q_{1-\delta}\}\ge\delta$, Theorem 2.1 implies the following
bound for the distance between any two quantiles in the upper tail:

Theorem 2.2. Assume $f$ satisfies (2.1). Then for all
$\delta<\gamma\le 1/2$,

$$Q_{1-\delta}-Q_{1-\gamma} \le \sqrt{\frac{(72/5)\,v\,\gamma}{\delta\,\log(e^2/(2\gamma))}} .$$
In particular, by choosing $\gamma=2^{-k}$ and $\delta=2^{-(k+1)}$ for
some integer $k\ge 1$ and introducing

$$a_k = Q_{1-2^{-k}},$$

we get

(2.3)  $$a_{k+1}-a_k \le 12\sqrt{\frac{v}{5((k-1)\log 2+2)}} \le 4\sqrt{\frac{v}{k}} .$$

Summing over $k=1,2,\ldots,m-1$ and using
$\sum_{k=1}^{m-1}k^{-1/2}\le\int_0^{m-1}x^{-1/2}\,dx=2\sqrt{m-1}$, we
obtain

$$a_m \le a_1 + 8\sqrt{v(m-1)},$$

recovering (up to a constant factor) the sub-Gaussian concentration
inequality (2.2) for $f$. However, Theorem 2.2 shows a sub-Gaussian
behavior in a significantly stronger sense. If $f(X)$ were a normal
random variable with variance $v$, then one would have
$a_k\sim\sqrt{2vk\log 2}$ and $a_{k+1}-a_k\sim\sqrt{v\log 2/(2k)}$. This
(up to a constant factor) is precisely of the form of the upper bound
(2.3) for a general function $f$ satisfying (2.1). Thus, the whole
quantile sequence $\{a_k\}$ is a contraction of that of a normal random
variable of variance a constant times $v$. (We say that a sequence
$\{x_n\}$ is a contraction of another sequence $\{y_n\}$ if for every
$n=1,2,\ldots$, $|x_{n+1}-x_n|\le|y_{n+1}-y_n|$.)
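The quantile sequence can be computed exactly for a simple test case. The sketch below is only an illustration under our own assumptions: it takes $f(x)=\sum_i x_i$ (which satisfies (2.1) with $v=n$, a far from optimal choice), uses exact rational arithmetic for the Binomial$(n,1/2)$ distribution, and compares each gap $a_{k+1}-a_k$ with the first bound appearing in (2.3). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, log, sqrt


def quantile(pmf, alpha):
    """Q_alpha = inf{z : P{f(X) <= z} >= alpha} for a pmf {value: probability}."""
    total = Fraction(0)
    for z in sorted(pmf):
        total += pmf[z]
        if total >= alpha:
            return z
    raise ValueError("alpha must be at most 1")


def binomial_quantile_gaps(n, kmax):
    """Rows (k, a_{k+1} - a_k, bound) for f(x) = x_1 + ... + x_n, X uniform."""
    pmf = {s: Fraction(comb(n, s), 2 ** n) for s in range(n + 1)}
    v = n  # sum_i (f(x) - f(x^(i)))_+^2 equals the number of ones, at most n
    rows = []
    for k in range(1, kmax + 1):
        a_k = quantile(pmf, 1 - Fraction(1, 2 ** k))
        a_next = quantile(pmf, 1 - Fraction(1, 2 ** (k + 1)))
        bound = 12 * sqrt(v / (5 * ((k - 1) * log(2) + 2)))
        rows.append((k, a_next - a_k, bound))
    return rows
```

Because $v=n$ is a crude choice for this function, the bound is loose here; the point of the sketch is only to make the objects $Q_\alpha$ and $a_k$ concrete.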

Remark (C). Even though we offer explicit numerical constants in the 
inequalities derived throughout the paper, no optimality of these values is 
claimed. In fact, quite often we sacrifice better constants for convenience in 
the notation or for simpler arguments. 

Example (C). One of the main examples of a function satisfying (2.1)
is Talagrand's convex distance (Talagrand [30]) defined as follows. Let
$A\subset\{0,1\}^n$ and define $f$ as

$$f(x) = \sup_{\alpha\in\mathbb{R}^n:\,\|\alpha\|=1}\ \inf_{y\in A}\ \sum_{i:\,x_i\ne y_i}|\alpha_i|,$$

where $x=(x_1,\ldots,x_n)$ and $y=(y_1,\ldots,y_n)$. Talagrand shows that
for any set $A$ with $P\{X\in A\}\ge 1/2$,

$$P\{f(X)\ge t\}\le 2e^{-t^2/4}.$$

(Note that Talagrand's result is true in any product space with product
measure.) It is shown by Boucheron, Lugosi and Massart [8] that $f$
satisfies (2.1) with $v=1$. This implies that for all $k=1,2,3,\ldots$,

$$a_{k+1}-a_k \le \frac{4}{\sqrt{k}} .$$

Example (L). Let $f(X)$ denote the largest eigenvalue of the adjacency
matrix of a random graph $G(m,1/2)$ on $m$ vertices such that each edge
appears with probability 1/2. Thus, $n=\binom{m}{2}$ and $X_i=1$ if and
only if edge $i$ is present in the graph. Füredi and Komlós [14] show
that $f(X)$ is asymptotically normally distributed with expectation
$m/2$ and variance 1/2. Alon, Krivelevich and Vu [3] show that $f(x)$
satisfies (2.1) with $v=4$ (see also Maurer [22]) and conclude that
$a_k\le Mf(X)+\sqrt{32(k+2)\log 2}$. Theorem 2.2 implies the
nonasymptotic local sub-Gaussian estimate

$$a_{k+1}-a_k \le \frac{8}{\sqrt{k}}$$

for $k=1,2,\ldots$. Note that Alon, Krivelevich and Vu [3] also prove a
concentration result for the $r$th largest eigenvalue of the form
$a_k\le Mf(X)+C_r\sqrt{k}$. Their argument may be combined with ours to
obtain an analogous local concentration inequality.



Example (R). Another example is a Rademacher average of the form

$$f(x)=\sup_{\alpha\in A}\sum_{i=1}^n\alpha_i(x_i-1/2),$$

where $A\subset\mathbb{R}^n$ is a set of vectors $\alpha$ with
$\|\alpha\|\le 1$. It is easy to see that condition (2.1) is satisfied
with $v=1$.
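The claim that a Rademacher average satisfies (2.1) with $v=1$ can be checked by brute force for a small finite index set. The sketch below is illustrative only: it assumes a finite set $A$ of randomly generated unit vectors, and the function names are hypothetical.

```python
import random
from itertools import product
from math import sqrt


def rademacher_average(x, A):
    """f(x) = max over alpha in A of sum_i alpha_i (x_i - 1/2)."""
    return max(sum(a_i * (x_i - 0.5) for a_i, x_i in zip(a, x)) for a in A)


def worst_case_v(A, n):
    """max over x in {0,1}^n of sum_i (f(x) - f(x^(i)))_+^2."""
    worst = 0.0
    for x in product((0, 1), repeat=n):
        fx = rademacher_average(x, A)
        total = 0.0
        for i in range(n):
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            total += max(fx - rademacher_average(flipped, A), 0.0) ** 2
        worst = max(worst, total)
    return worst


def random_unit_vectors(count, n, seed=0):
    """A finite test set A of vectors with Euclidean norm 1 (an assumption
    made for this sketch; the text allows any A with norms at most 1)."""
    rng = random.Random(seed)
    vectors = []
    for _ in range(count):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = sqrt(sum(c * c for c in a))
        vectors.append([c / norm for c in a])
    return vectors
```

The reason the check succeeds: if $\alpha^*$ attains the supremum at $x$, then flipping bit $i$ lowers the value by at most $|\alpha^*_i|$, so the sum of squared positive parts is at most $\|\alpha^*\|^2\le 1$.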



Remark (A). We note here that Talagrand proved his inequality (1.1) in
a more general setup in which the components $X_i$ of $X$ are i.i.d.
Bernoulli($p$) random variables for some $p\in(0,1)$. In this more
general case Theorem 2.2 becomes



Qi~5 — Ql-y < 



51og(l/(27))logl/(p(l-p)) 



for some constant C. 

One obtains a corollary of a slightly different flavor by choosing, in
Theorem 2.1, $a=k$ and $b=k+1$ for some integer $k\ge Mf$; Theorem 2.1
implies the following local lower bound for the distribution of $f$:

Corollary 2.1. Assume $f$ satisfies (2.1). Then for all
$k\ge Ef+\sqrt{4v\log 2}$,

$$\frac{q_k}{\sum_{i\ge k+1}q_i}+1 \ge \frac{5}{288}\,\frac{(k-Ef)^2}{v^2}+\frac{5}{72v}\log\frac{e^2}{2},$$

where $q_k=P\{f(X)\in[k,k+1)\}$.
Proof. This follows immediately by noting that, on the one hand, by
Theorem 2.1, for $k\ge Mf$,

$$\log\frac{e^2}{2\sum_{i\ge k}q_i} \le (72/5)\,v\,\Bigl(\sum_{i\ge k}q_i\Bigr)\Bigl(\sum_{i\ge k+1}q_i\Bigr)^{-1},$$

so that

$$q_k+\sum_{i\ge k+1}q_i \ge \frac{e^2}{2}\exp\Bigl(-(72/5)\,v\,\Bigl(\frac{q_k}{\sum_{i\ge k+1}q_i}+1\Bigr)\Bigr).$$

By the concentration inequality (2.2), for all $k\ge Ef$,

$$q_k+\sum_{i\ge k+1}q_i = P\{f(X)\ge k\}\le e^{-(k-Ef)^2/(4v)} .$$

Since $Mf\le Ef+\sqrt{4v\log 2}$, combining the upper and lower bounds
implies the corollary. □

Remark (Monotonicity of the tail). An obvious corollary is that
$q_{k+1}\le q_k$ whenever $k\ge Ef+(25/\sqrt{5})v$.

In some applications, even though (2.1) is not satisfied, the similar
condition

(2.4)  $$\sum_{i=1}^n (f(x)-f(x^{(i)}))_-^2 \le v$$

holds. For such cases the next analog of Theorem 2.1 is true. The proof is
omitted as it is a straightforward modification. In Section 5 we present some
applications of this result.

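Theorem 2.3. Assume $f$ satisfies (2.4) and let
$B=\max_{x,i}|f(x)-f(x^{(i)})|$. Then for all $b>a\ge Mf$,

$$b-a \le \sqrt{\frac{(72/5)\,v\,P\{f(X)\in(a-B,b)\}}{P\{f(X)\ge b\}\,\log\bigl(e^2/(2P\{f(X)\in(a-B,b)\})\bigr)}} .$$

In particular, for all $\delta<\gamma\le 1/2$, by taking
$a=Q_{1-\gamma}+B$ and $b=Q_{1-\delta}$, we have

$$Q_{1-\delta}-Q_{1-\gamma} \le B+\sqrt{\frac{(72/5)\,v\,\gamma}{\delta\,\log(e^2/(2\gamma))}} .$$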
3. Configuration functions. In this section we consider functions defined
on the binary hypercube. Just as in Section 1, let
$f:\{0,1\}^n\to\mathbb{R}$ and assume that $X$ is uniformly distributed
over $\{0,1\}^n$.

Often, the sum of the squared changes appearing in condition (2.1) cannot
be bounded by a constant but it can be related to the value of the function
itself. Consider the following conditions:

(3.1)  $$|f(x)-f(x^{(i)})|\le B \ \text{ for all } x \text{ and } i, \quad\text{and}\quad \sum_{i=1}^n (f(x)-f(x^{(i)}))_+^2 \le \phi(f(x)),$$

where $\phi$ is a fixed nonnegative nondecreasing function defined on the
reals. In many applications, such as for configuration functions, one may
take $\phi$ to be the identity and in some others it has the form
$\phi(u)=au+b$ (see Talagrand [30], Boucheron, Lugosi and Massart [7, 8]
and Devroye [9]). For example, it is shown by Boucheron, Lugosi and
Massart [8] (for various extensions see also Maurer [22], McDiarmid and
Reed [24]) that if (3.1) is satisfied with $\phi(u)=u$ and $B\le 1$, then

$$P\{f(X)\ge Ef(X)+t\}\le e^{-t^2/(2Ef(X)+2t/3)} .$$

Boucheron, Lugosi and Massart [8] offer concentration inequalities for the
case when $\phi(u)=cu^\alpha$ for some $\alpha\in(0,2)$.

A straightforward modification of the proof of Theorem 2.1 yields the 
following: 

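Theorem 3.1. Assume $f$ satisfies (3.1) and let
$B=\max_{x,i}|f(x)-f(x^{(i)})|$. Then for all $b>a\ge Mf$,

$$b-a \le \sqrt{\frac{(72/5)\,\phi(b+B)\,P\{f(X)>a\}}{P\{f(X)\ge b\}\,\log\bigl(e^2/(2P\{f(X)>a\})\bigr)}} .$$

Also, for all $\delta<\gamma\le 1/2$,

$$Q_{1-\delta}-Q_{1-\gamma} \le \sqrt{\frac{(72/5)\,\phi(Q_{1-\delta}+B)\,\gamma}{\delta\,\log(e^2/(2\gamma))}} .$$

In particular, recalling the notation $a_k=Q_{1-2^{-k}}$,

$$a_{k+1}-a_k \le 4\sqrt{\frac{\phi(a_{k+1}+B)}{k}} .$$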

Example (Self-bounding functions). In many interesting applications,
$\phi$ may be taken to be the identity function and $B=1$. These
functions have been called self-bounding; see Boucheron, Lugosi and
Massart [7], Maurer [22], McDiarmid and Reed [24]. In general, if
$\phi(u)$ is linear, then by the above-mentioned concentration
inequality, for all $k\ge Ef(X)$, $a_k\le ck$, and therefore

$$a_{k+1}-a_k \le C,$$

where $c$, $C$ are constants. Thus, in this case the quantile sequence
$\{a_k\}$ is a contraction of that corresponding to an exponentially
distributed random variable with parameter $O(1)$, in a similar sense
that functions satisfying (2.1) had a quantile sequence contracting a
Gaussian quantile sequence.

Example (Longest increasing subsequences). Let now
$x=(x_1,\ldots,x_n)\in\{0,1,\ldots,r-1\}^n$ and define $f(x)$ to be the
length of the longest increasing subsequence of $x$, that is, the largest
positive integer $m$ for which there exist
$1\le i_1<\cdots<i_m\le n$ such that
$x_{i_1}\le x_{i_2}\le\cdots\le x_{i_m}$. Tracy and Widom [33] and
Johansson [18] showed that if $X$ is uniformly distributed over
$\{0,1,\ldots,r-1\}^n$, then $(f(X)-n/r)/\sqrt{2n/r}$ converges, in
distribution, to a random variable whose distribution depends on $r$
(see also Its, Tracy and Widom [16]). In the binary case (i.e., when
$r=2$), $f(x)$ is the length of the longest subsequence of the form
$000\cdots 00111\cdots 11$, and Theorem 3.1 may readily be used. It is
immediate to see that $f$ satisfies (3.1) with $B=1$ and $\phi(u)=u$ and
therefore Theorem 3.1 implies a nonasymptotic local subexponential
concentration inequality. [To see why (3.1) is satisfied, fix a maximal
increasing subsequence in $x$ and observe that
$(f(x)-f(x^{(i)}))_+=0$ whenever $x_i$ is not in this maximal sequence.]
The same inequality holds when $f(x)=\log_2 N(x)$ where $N(x)$ is the
number of all increasing subsequences of $x$. The fact that
$\log_2 N(x)$ satisfies (3.1) with $B=1$ and $\phi(u)=u$ was observed by
Boucheron, Lugosi and Massart [7]. If $r>2$, one may use the results of
Section 4 below to obtain analogous bounds.
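The self-bounding property in the binary case is easy to confirm exhaustively for short strings. The following is a minimal sketch with hypothetical function names: it computes the length of the longest subsequence of the form $0\cdots01\cdots1$ and checks both parts of (3.1) with $B=1$ and $\phi(u)=u$ for every $x\in\{0,1\}^n$.

```python
from itertools import product


def lis_binary(x):
    """Length of the longest weakly increasing subsequence of a 0/1 tuple:
    the best split into (zeros taken from a prefix) + (ones taken from the rest)."""
    ones_suffix = sum(x)
    zeros_prefix = 0
    best = ones_suffix  # split before the first position
    for bit in x:
        if bit == 0:
            zeros_prefix += 1
        else:
            ones_suffix -= 1
        best = max(best, zeros_prefix + ones_suffix)
    return best


def satisfies_condition_3_1(n):
    """Check |f(x) - f(x^(i))| <= 1 and sum_i (f(x) - f(x^(i)))_+^2 <= f(x)."""
    for x in product((0, 1), repeat=n):
        fx = lis_binary(x)
        total = 0
        for i in range(n):
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            diff = fx - lis_binary(flipped)
            if abs(diff) > 1:
                return False
            total += max(diff, 0) ** 2
        if total > fx:
            return False
    return True
```

The exhaustive loop costs on the order of $n^2 2^n$ operations, so it is only meant for small $n$.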

Remark (Concentration inequalities). The recursion for the sequence
$\{a_k\}$ given by Theorem 3.1 allows one to derive concentration
inequalities for general functions $\phi$. We illustrate this for the
example when $\phi(u)\le cu^\alpha$ for some $c>0$ and
$\alpha\in[0,2]$. Then Theorem 3.1 implies, after some work, that there
exist constants $C$, $t_0$ such that for $t>t_0$,



Ofc+i — ak<C 




if < a < 2, 
if a = 2. 



The case $\alpha<2$ has already been dealt with by Boucheron, Lugosi and
Massart [8], but the $\alpha=2$ case seems to be new.
4. Functions defined on the r-ary hypercube. The purpose of this
section is to extend the results of Theorems 2.1, 2.2 and 2.3 to
functions $f$ defined on the $r$-ary cube $\{0,1,\ldots,r-1\}^n$,
equipped with the uniform distribution. In order to do this, we need to
generalize Talagrand's variance inequality to this case. In particular,
we prove the following:

Theorem 4.1. Let $r\ge 2$ be a positive integer and let
$f:\{0,1,\ldots,r-1\}^n\to\mathbb{R}$ be a real-valued function. Suppose
$X=(X_1,\ldots,X_n)$ is uniformly distributed on $\{0,1,\ldots,r-1\}^n$.
For $1\le i\le n$, $0\le j\le r-1$ and for each $x=(x_1,\ldots,x_n)$,
denote $x_{i,j}=(x_1,\ldots,x_{i-1},x_i\oplus j,x_{i+1},\ldots,x_n)$
where $\oplus$ stands for addition modulo $r$. Writing

$$\Delta_i f(x)=f(x)-\frac{1}{r}\sum_{j=0}^{r-1}f(x_{i,j}),$$

we have

$$\operatorname{Var}(f)\le 10\log C_r\sum_{i=1}^n\frac{E(\Delta_i f(X))^2}{1+\log\bigl(\sqrt{E(\Delta_i f(X))^2}\,/\,E|\Delta_i f(X)|\bigr)},$$

where $C_r=(9/2)r^3$ is the constant of Lemma 4.1 below.
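The operator $\Delta_i$ is simple to compute directly, which helps to see how it relates to the bit-flip difference used in the binary case. The sketch below uses a hypothetical helper name `delta`; for $r=2$ it returns $(f(x)-f(x^{(i)}))/2$, and averaging $\Delta_i f$ over coordinate $i$ always gives zero.

```python
from itertools import product


def delta(f, x, i, r):
    """Delta_i f(x) = f(x) - (1/r) * sum_{j=0}^{r-1} f(x_{i,j}),
    where x_{i,j} replaces x_i by (x_i + j) mod r."""
    shifted = [f(x[:i] + ((x[i] + j) % r,) + x[i + 1:]) for j in range(r)]
    return f(x) - sum(shifted) / r


def mean_of_delta(f, i, r, n):
    """Average of Delta_i f over the uniform distribution on {0,...,r-1}^n."""
    cube = list(product(range(r), repeat=n))
    return sum(delta(f, x, i, r) for x in cube) / len(cube)
```

For instance, with $f(x)=x_1+x_2$ on $\{0,1,2\}^2$ and $x=(2,0)$, the three shifted values in the first coordinate are 2, 0 and 1, so $\Delta_1 f(x)=2-1=1$.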

As a consequence, Theorems 2.1, 2.2, 2.3 and 3.1 may now be extended
to functions defined on $\{0,1,\ldots,r-1\}^n$ with the only difference
that in the conditions on $f$, $f(x)-f(x^{(i)})$ is replaced by
$\Delta_i f(x)$ and the upper bounds in all four theorems are multiplied
by $(10/3)\sqrt{\log C_r}$. For example, we will use the following result
in Section 5:

Corollary 4.1. Assume $f:\{0,1,\ldots,r-1\}^n\to\mathbb{R}$ is such
that there exists $v>0$ such that

$$\sum_{i=1}^n(\Delta_i f(x))_-^2\le v$$

and let $B=\max_{x,i}|\Delta_i f(x)|$. Then for all $k=1,2,3,\ldots$,

$$a_{k+1}-a_k\le B+14\sqrt{\frac{v\log C_r}{k}} .$$

The proof of Theorem 4.1 is analogous to Talagrand's [29] original
argument which was based on the Beckner-Bonami hypercontractive
inequality (see Bonami [6] and Beckner [4]) of Fourier analysis on the
binary hypercube. Here we use an extension of this inequality to
functions defined on $\{0,1,\ldots,r-1\}^n$ due to Alon, Dinur, Friedgut
and Sudakov [2] which we recall below.

For any $S=(S_1,\ldots,S_n)\in\{0,1,\ldots,r-1\}^n$, define the function

$$u_S(x)=\omega^{\langle S,x\rangle},$$

where $\omega=e^{2\pi i/r}$ and
$\langle S,x\rangle=\sum_{i=1}^n S_i x_i \bmod r$. It is easy to see
(see [2]) that the $u_S$ form an orthonormal basis of the space of
complex-valued functions defined over $\{0,1,\ldots,r-1\}^n$. To simplify
notation, we will write

$$\int f=\frac{1}{r^n}\sum_{x\in\{0,1,\ldots,r-1\}^n}f(x),\qquad \|f\|_p=\Bigl(\int|f|^p\Bigr)^{1/p}.$$

Denote by

$$\hat f(S)=\int f\,\bar u_S$$

the Fourier coefficients of $f$ where $\bar u_S$ stands for the complex
conjugate of $u_S$.
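Orthonormality of the characters and Parseval's identity can be verified numerically for small $r$ and $n$. The following sketch assumes the normalization in which integrals are averages over the cube; the function names are hypothetical and only the standard library is used.

```python
import cmath
from itertools import product


def character(S, x, r):
    """u_S(x) = omega^{<S, x>} with omega = exp(2*pi*i/r)."""
    inner = sum(s * xi for s, xi in zip(S, x)) % r
    return cmath.exp(2j * cmath.pi * inner / r)


def inner_product(S, T, r, n):
    """Average of u_S * conj(u_T) over the cube: 1 if S == T, else 0."""
    cube = product(range(r), repeat=n)
    return sum(character(S, x, r) * character(T, x, r).conjugate()
               for x in cube) / r ** n


def fourier_coefficient(f, S, r, n):
    """hat f(S) = average over x of f(x) * conj(u_S(x))."""
    cube = product(range(r), repeat=n)
    return sum(f(x) * character(S, x, r).conjugate() for x in cube) / r ** n


def parseval_gap(f, r, n):
    """| average of f^2  -  sum_S |hat f(S)|^2 |, which should be ~0."""
    cube = list(product(range(r), repeat=n))
    l2_sq = sum(f(x) ** 2 for x in cube) / r ** n
    coeff_sq = sum(abs(fourier_coefficient(f, S, r, n)) ** 2 for S in cube)
    return abs(l2_sq - coeff_sq)
```

Since the coefficients of a real-valued $f$ are in general complex when $r>2$, the sketch sums squared moduli $|\hat f(S)|^2$.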
A key ingredient of the proof is the following hypercontractive inequality:

Lemma 4.1 (Alon, Dinur, Friedgut and Sudakov [2]). For any
$f:\{0,1,\ldots,r-1\}^n\to\mathbb{R}$ and $k=1,\ldots,n$,

$$\Bigl\|\sum_{S:\,|S|\le k}\hat f(S)u_S\Bigr\|_4\le C_r^{k}\Bigl(\sum_{S:\,|S|\le k}|\hat f(S)|^2\Bigr)^{1/2},$$

where $C_r=(9/2)r^3$.



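Proof of Theorem 4.1. Writing $f_{i,j}(x)=f(x_{i,j})$, it is easy to
see that $\hat f_{i,j}(S)=\hat f(S)\,\omega^{jS_i}$. Thus,

$$\frac{1}{r}\sum_{j=0}^{r-1}\hat f_{i,j}(S)=\begin{cases}\hat f(S), & \text{if } S_i=0,\\ 0, & \text{if } S_i\ne 0,\end{cases}$$

and therefore the Fourier coefficients of $\Delta_i f$ satisfy

$$\widehat{\Delta_i f}(S)=\begin{cases}0, & \text{if } S_i=0,\\ \hat f(S), & \text{if } S_i\ne 0.\end{cases}$$

This and Parseval's identity imply that

$$\operatorname{Var}(f)=\int f^2-\Bigl(\int f\Bigr)^2=\sum_{S\ne 0}|\hat f(S)|^2=\sum_{i=1}^n\sum_{S\ne 0}\frac{|\widehat{\Delta_i f}(S)|^2}{|S|},$$

where $|S|$ denotes the number of nonzero components of $S$ and $0$ is
the all-zero vector.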
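Thus, in order to prove the theorem, it suffices to show that for any
$f:\{0,1,\ldots,r-1\}^n\to\mathbb{R}$,

$$\sum_{S\ne 0}\frac{|\hat f(S)|^2}{|S|}\le\frac{10\log C_r\,\|f\|_2^2}{1+\log(\|f\|_2/\|f\|_1)},$$

which is what we do in the remaining part of the proof. Fix $k\le n$ and
observe that

$$\begin{aligned}
\sum_{S:\,|S|=k}|\hat f(S)|^2
&=\int\Bigl(\sum_{S:\,|S|=k}\hat f(S)u_S\Bigr)f \\
&\le\Bigl\|\sum_{S:\,|S|=k}\hat f(S)u_S\Bigr\|_4\,\|f\|_{4/3}
 \quad\text{(by Hölder)}\\
&\le C_r^{k}\Bigl(\sum_{S:\,|S|=k}|\hat f(S)|^2\Bigr)^{1/2}\|f\|_{4/3}
 \quad\text{(by Lemma 4.1).}
\end{aligned}$$

This implies

$$\sum_{S:\,|S|=k}|\hat f(S)|^2\le C_r^{2k}\,\|f\|_{4/3}^2$$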

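and we have, for all positive integers $m$,

$$\sum_{S:\,1\le|S|\le m}\frac{|\hat f(S)|^2}{|S|}\le\sum_{k=1}^m\frac{C_r^{2k}}{k}\,\|f\|_{4/3}^2\le K\,\frac{C_r^{2m}}{m}\,\|f\|_{4/3}^2,$$

where $K=\dfrac{36^2/2}{36^2/2-1}$. In the last step we used the fact
that $C_r\ge 36$ and therefore
$C_r^{2(k+1)}/(k+1)\ge(36^2/2)\,C_r^{2k}/k$. Now we may write

$$\begin{aligned}
\sum_{S\ne 0}\frac{|\hat f(S)|^2}{|S|}
&\le K\,\frac{C_r^{2m}}{m}\,\|f\|_{4/3}^2+\frac{1}{m+1}\sum_{S:\,|S|>m}|\hat f(S)|^2 \\
&\le\frac{1}{m+1}\bigl(2K\,C_r^{2m}\,\|f\|_{4/3}^2+\|f\|_2^2\bigr).
\end{aligned}$$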



Now we choose m as the largest integer such that C'r™ II /II4/3 — ^^^'^ 
that 



so 



m + 



log(eV3||/||,/||/||,^3) 
logC, 
and




The proof is finished by observing that 



//^/^=ii/^/^ii^^^<ii/^/^ir<(ii/iii-ii/iii)^/^ 



by the Cauchy-Schwarz inequality, and therefore
$\|f\|_{4/3}^2\le\|f\|_1\cdot\|f\|_2$, which is equivalent to



Remark (Logarithmic Sobolev inequalities). An alternative route,
yielding better numerical constants than Lemma 4.1, would be to use a
sharp logarithmic Sobolev inequality of Diaconis and Saloff-Coste ([10],
Theorem A.1) which implies hypercontractivity by Gross' theorem; see [15].

5. Minimum weight spanning tree and the assignment problem. In this 
section we derive local concentration bounds for two classical problems: the 
minimum weight spanning tree and the assignment problem. In these exam- 
ples the random variables of interest are functions of independent random 
variables uniformly distributed in [0,1]. By simple discretization we may
approximate them by functions defined over $\{0,1,\ldots,r-1\}^n$ and use
the result of the previous section. Since in Corollary 4.1 the dependence on r
is only logarithmic, we may take r to be quite large (proportional to n in 
these cases) and still obtain meaningful results. 

Concentration inequalities for both cases may be derived, for example, 
by Talagrand's [30] results. In fact, Talagrand works out the case of the 
assignment problem. In order to conveniently use general concentration in- 
equalities, Talagrand uses a truncation argument, a technique we also adopt 
below. Interestingly, the proofs in both examples below are identical and use 
simple structural properties of the function at hand. 

Example (Minimum weight spanning tree). Consider the random variable
$T_m$ defined as the sum of weights on the minimum spanning tree of the
complete graph $K_m$ with independent uniformly distributed (on [0,1])
weights $Y_{i,j}$ ($1\le i<j\le m$) on the edges. A classical result of
Frieze [13] shows that $\lim_{m\to\infty}ET_m=\zeta(3)$. Janson [17] and
Wästlund [36] prove that if the edge weights are exponentially
distributed with parameter 1, then $\sqrt{m}(T_m-\zeta(3))$ converges, in
distribution, to a centered normal random variable with variance
$6\zeta(4)-4\zeta(3)$. Here we study the related random variable
$\widetilde T_m$ obtained when the $Y_{i,j}$ are replaced by
$\min(Y_{i,j},\delta_m)$ where $\delta_m>0$ is a small positive number.
Note that if $\delta_m=c\log m/m$ for some $c>1$, then
$\widetilde T_m=T_m$ with high probability. In order to see this just
observe that $\widetilde T_m\ne T_m$ implies that the largest edge weight
in the minimum spanning tree is greater than $\delta_m$. But this is just
the probability that the
random graph G{m,6m) is not connected which is at most 2(e"*^^ "''^'^ ~ 1) + 
2m+ij^-{c-i)m/4 ^ggg Erdos and Renyi [12] and Palmer [26]), which is at 
most 4m~'^/^, if c > 2. 
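The effect of the truncation can be explored numerically. The sketch below is illustrative only and rests on our own choices: it uses Kruskal's algorithm with a small union-find, simulated uniform weights with a fixed seed, and hypothetical function names. It returns the untruncated cost, the truncated cost, the largest edge weight in the untruncated minimum spanning tree and the truncation level, so that one can see that the two costs agree whenever the largest tree edge does not exceed the truncation level.

```python
import math
import random


def mst_edges(m, weight):
    """Kruskal's algorithm on the complete graph K_m.
    `weight` maps pairs (i, j), i < j, to edge weights."""
    parent = list(range(m))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    chosen = []
    for (i, j) in sorted(weight, key=weight.get):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            chosen.append((i, j))
    return chosen


def mst_cost(m, weight):
    return sum(weight[e] for e in mst_edges(m, weight))


def truncation_experiment(m, c, seed=0):
    """Compare the MST cost with and without truncation at c*log(m)/m."""
    rng = random.Random(seed)
    Y = {(i, j): rng.random() for i in range(m) for j in range(i + 1, m)}
    level = c * math.log(m) / m
    truncated = {e: min(w, level) for e, w in Y.items()}
    largest_tree_edge = max(Y[e] for e in mst_edges(m, Y))
    return mst_cost(m, Y), mst_cost(m, truncated), largest_tree_edge, level
```

Truncation can only lower edge weights, so the truncated cost never exceeds the original one; equality holds as soon as the original tree uses no edge above the truncation level.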

To be able to use Corollary 4.1, we need to approximate $\widetilde T_m$
by a function defined on $\{0,1,\ldots,r-1\}^n$ under the uniform
distribution where $n=\binom{m}{2}$. In order to do this, we replace the
random variables $Y_{i,j}$ by their "discretized" approximation
$\lfloor rY_{i,j}\rfloor/r$. If we denote the cost of the minimum
spanning tree defined by the edge costs
$\min(\lfloor rY_{i,j}\rfloor/r,\delta_m)$ by $\widehat T_m$, then
clearly $|\widetilde T_m-\widehat T_m|\le m/r$. The random variable
$\widehat T_m$ may now be considered as a function of $n=\binom{m}{2}$
independent variables $X_{i,j}$, all uniformly distributed on
$\{0,1,\ldots,r-1\}$, by defining $\lfloor rY_{i,j}\rfloor=X_{i,j}$. Now
we may use Corollary 4.1. Clearly, we may take $B=\delta_m$. On the other
hand,

$$\sum_{1\le i<j\le m}(\Delta_{i,j}\widehat T_m)_-^2\le m\delta_m^2$$

and therefore, denoting by $\hat a_k$ the $1-2^{-k}$-quantile of
$\widehat T_m$, we obtain

$$\hat a_{k+1}-\hat a_k\le\delta_m+14\sqrt{\frac{m\delta_m^2}{k}\log(9r^3/2)} .$$

This, in turn, implies that if $\tilde a_k$ denotes the
$1-2^{-k}$-quantile of $\widetilde T_m$, then, for all $k=1,2,3,\ldots$,

$$\tilde a_{k+1}-\tilde a_k\le 2m/r+\delta_m+14\sqrt{\frac{m\delta_m^2}{k}\log(9r^3/2)} .$$
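The discretization error can also be checked on simulated data. In the sketch below (hypothetical function names, simulated uniform weights, Kruskal's algorithm chosen by us for convenience), every truncated edge weight moves by less than $1/r$ when it is replaced by its discretized version and a spanning tree has $m-1$ edges, so the two minimum spanning tree costs differ by at most $m/r$.

```python
import math
import random


def mst_cost(m, weight):
    """Cost of a minimum spanning tree of K_m via Kruskal's algorithm.
    `weight` maps pairs (i, j), i < j, to edge weights."""
    parent = list(range(m))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    total = 0.0
    for (i, j) in sorted(weight, key=weight.get):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            total += weight[(i, j)]
    return total


def discretization_gap(m, r, c, seed=0):
    """Return (|truncated cost - discretized truncated cost|, m / r)."""
    rng = random.Random(seed)
    level = c * math.log(m) / m
    Y = {(i, j): rng.random() for i in range(m) for j in range(i + 1, m)}
    truncated = {e: min(w, level) for e, w in Y.items()}
    discretized = {e: min(math.floor(r * w) / r, level) for e, w in Y.items()}
    gap = abs(mst_cost(m, truncated) - mst_cost(m, discretized))
    return gap, m / r
```

Choosing `r` as a power of `m`, for example `r = m * m`, makes the gap of order $1/m$ while only affecting the bound of Corollary 4.1 through $\log C_r$.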

By choosing, say, $r=m^2$ and $\delta_m=c\log m/m$ for some constant
$c>1$, we obtain

$$\tilde a_{k+1}-\tilde a_k\le C\Bigl(\sqrt{\frac{\log^3 m}{mk}}+\frac{\log m}{m}\Bigr)$$

for a constant $C$ depending on $c$ only. This inequality shows a local
sub-Gaussian behavior whenever $k\le m\log m$. It may be regarded as a
local nonasymptotic version of the limit theorem of Janson and Wästlund,
up to the logarithmic factors we needed to give up for technical reasons.
For larger values of $k$ the second term dominates the first one, which
corresponds to a subexponential behavior in the far tail. We do not know
if this term is necessary. In order to convert this into a useful bound
for the original
problem $T_m$, one needs to choose $c$ so large that the bound
$P\{\widetilde T_m\ne T_m\}\le 4m^{-c/4}$ does not dominate $2^{-k}$.
Choosing $c=\max(2,4(k+2)\log 2/\log m)$, one obtains, for the quantiles
$a_k$ of $T_m$,

$$a_{k+1}-a_{k-1}\le\begin{cases}
\dfrac{2}{m}+\dfrac{4(k+2)\log 2}{m}+56\log 2\sqrt{\dfrac{3(k+2)}{m}\log\dfrac{9m^6}{2}}, & \text{if } k+2\ge\dfrac{\log m}{2\log 2},\\[2ex]
\dfrac{2}{m}+\dfrac{2\log m}{m}+28\sqrt{\dfrac{\log^2 m}{mk}\log\dfrac{9m^6}{2}}, & \text{if } k+2<\dfrac{\log m}{2\log 2}.
\end{cases}$$

In order to compare this local bound to concentration inequalities, note 
that Theorem 7 of Boucheron, Lugosi and Massart [8] implies that P{Tm > 



A;c^ log m log 2 



4(e — l)m 

Again, choosing c = max(2, 4(A; + 2) log 2/ log m), one obtains 



log 2 / fclog^m /4(fc + 2)3 
afc_i<ET„ + J^max ^ ^ ^ 



1 V V V 

By summing the "local" inequality in $k$, one obtains a concentration
inequality that is only slightly weaker than the one derived here, as we
get an extra factor of $\sqrt{\log m}$. This is due to the approximation
by discretization, necessary to apply Corollary 4.1.

Example (The assignment problem). In the assignment problem, given
an $m\times m$ array $\{Y_{i,j}\}_{m\times m}$ of independent random
variables distributed uniformly on [0, 1], one considers the random
quantity

$$Z_m=\min_\pi\sum_{i=1}^m Y_{i,\pi(i)},$$

where the minimum is taken over all permutations $\pi$ of
$\{1,\ldots,m\}$. Culminating a long series of partial results, Aldous
[1] shows that $\lim_{m\to\infty}EZ_m=\zeta(2)$. In the case when the
$Y_{i,j}$ are exponentially distributed with parameter 1, Linusson and
Wästlund [21] and Nair, Prabhakar and Sharma [25] independently prove
that for all $m$, $EZ_m=\sum_{i=1}^m i^{-2}$. See also Wästlund [34].
Wästlund [35] also derives an explicit formula for the variance of
$Z_m$. In particular, he proves that
$\operatorname{Var}(Z_m)=4(\zeta(2)-\zeta(3))/m+O(m^{-2})$. Talagrand
[30] proves (in the uniform model) an exponential concentration
inequality very similar to the one described for the minimum weight
spanning tree above.

In fact, in order to get local concentration inequalities, we may proceed
exactly as we did in the previous example: first we replace the $Y_{i,j}$
by the truncated variables $\min(Y_{i,j},\delta_m)$. If
$\widetilde Z_m$ denotes the cost of the optimal assignment based on the
truncated variables, then Proposition 10.3 of Talagrand [30] implies that
there exists a constant $K$ such that
$P\{\widetilde Z_m\ne Z_m\}\le e^{-m\delta_m/K}$, an inequality that is
completely analogous to the one we used in the study of the minimum
weight spanning tree. Second, we use the discretized approximation
$\widehat Z_m$ of the truncated variables. Then just as for the minimum
weight spanning tree, we may take $B=\delta_m$ in Corollary 4.1 and
observe that

$$\sum_{i,j=1}^m(\Delta_{i,j}\widehat Z_m)_-^2\le m\delta_m^2,$$

which leads to inequalities completely analogous to those obtained for
the minimum weight spanning tree example above. In particular, if $a_k$
denotes the $1-2^{-k}$ quantile of $Z_m$, then there exists a constant
$C$ such that

$$a_{k+1}-a_{k-1}\le C\max\Bigl(\frac{k}{m}+\sqrt{\frac{k\log m}{m}},\ \frac{\log m}{m}+\sqrt{\frac{\log^3 m}{km}}\Bigr).$$